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Abstract 

Background: The Phosphate transporter 1 {PHT1) gene family has crucial roles in phosphate uptake, translocation, 
remobilization, and optimization of metabolic processes using of Pi. Gene duplications expand the size of gene 
families, and subfunctionalization of paralog gene pairs is a predominant tendency after gene duplications. To date, 
experimental evidence for the evolutionary relationships among different paralog gene pairs of a given gene family 
in soybean is limited. 

Results: All potential Phosphate transporter 1 genes in Glycine max L. {GmPHTl) were systematically analyzed using 
both bioinformatics and experimentation. The soybean PHT1 genes originated from four distinct ancestors prior to 
the Gamma WGT and formed 7 paralog gene pairs and a singleton gene. Six of the paralog gene pairs underwent 
subfunctionalization, and while GmPHTl ;4 paralog gene experienced pseudogenization. Examination of long-term 
evolutionary changes, six GmPHTl paralog gene pairs diverged at multiple levels, in aspects of spatio-temporal 
expression patterns and/or quanta, phosphates affinity properties, subcellular localization, and responses to 
phosphorus stress. 

Conclusions: These characterized divergences occurred in tissue- and/or development-specific modes, or 
conditional modes. Moreover, they have synergistically shaped the evolutionary rate of GmPHTl family, as well as 
maintained phosphorus homeostasis at cells and in the whole plant. 

Keywords: Phosphate transporter 1, Gene duplication, Gene divergence, Phosphorus homeostasis, Evolution, 
Glycine max L. 



Background 

To adapt to challenging environments, plants have 
developed dramatic modifications in morphological, 
physiological, biochemical and molecular processes. 
Gene duplications are widespread in plant genomes, 
having accumulated a wealth of genetic raw materials to 
meet the selection pressures of new environmental con- 
ditions [1-5]. After gene duplications, there are several 
possible fates for duplicated genes (or paralogs), which 
include subfunctionalization through purifying selections 
(Ka/Ks < 1) [6], neofunctionalization through positive se- 
lections (Ka/Ks > 1) [7], pseudogenization [8], and loss in 
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genome (fractionation) [9-11]. Subfunctionalization is 
the predominant paralog outcome following duplications 
[12]. It reduces the fitness cost of gene duplication by 
buffering dosage imbalances, as well as maintaining the 
functional requirements of the ancestral locus [13]. 

In plants, phosphorus is one of three primary mineral nu- 
trients. It is second most limiting macronutrient for optimal 
growth, due to the relatively large amounts of Pi required 
by plants, the limited amount of available phosphorus 
(orthophosphate, Pi), and the poor mobility of phosphorus 
in soil [14,15]. Plant uptake of Pi from soil relies heavily 
upon the phosphate transporter 1 family (PHT1) [16]. 
PHT1 genes code for plasma membrane proteins, which 
contain 12 transmembrane domains. The PHT1 proteins 
are functionally involved in Pi uptake from the soil, Pi 
translocation across plant tissues, and Pi remobiliza- 
tion from senescent organs (Review in [14,17,18]). Homolo- 
gous genes of PHT1 have been identified in a wide range of 
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species, and they share conserved functions in the Pi uptake 
(Additional file 1) [14] and variable Pi affinity [19-22]. 

The soybean has experienced the Gamma whole genome 
triplication (Gamma WGT) -130 to 240 million years ago 
(mya), the legume WGD (Legume WGD, -58 mya), and 
the Glycine WGD in the Glycine lineage {Glycine WGD, 
-13 mya) [23,24]. As a result, in the present soybean gen- 
ome, about 75% of the genes have multiple paralogs 
[23,25]. Approximately 50% of these paralogs were differ- 
entially expressed and underwent subfunctionalization 
[12], possibly contributing to phenotypic variation in poly- 
ploids [3]. 

Many PHT1 genes have functionally identified in many 
plants, but they are studied as an individual. This was a 
limiting step for both further functional characterization 
of PHT1 family as a whole and genetic evolution analysis 
of them in relation to low Pi environment adaptations. 
In this study, 15 GmPHTl (Glycine max PHT1) family 
paralogs from soybean were identified. Based on data from 
spatio-temporal expression profiles, functional character- 
izations in a heterologous yeast system and subcellular lo- 
calizations, we propose fates of paralogs of soybean PHT1 
were subfunctionalization. These results provided a strong 
basis for function analysis and evolution of gene families. 

Results 

Identification, phylogenetic relationship and promoters of 
soybean PHT1 genes 

Fourteen soybean PHT1 genes with full length sequence 
were found [26,27]. In addition, a syntenic analysis using 
the PGDD or CoGe databases identified one potential 
pseudogene (Glymal3gl8420), which had a truncated 
open reading frame length (ORF) of 444 bp. This poten- 
tial pseudogene, which was determined to be masked, 
has not been previously identified [21,26,28]. For an uni- 
fied nomenclature for the soybean PHTls (Additional 
file 2), fifteen soybean PHT1 genes were renamed as 
GLYma;Phtl;l through GLYma;Phtl;15 according to the 
Commission for Plant Gene Nomenclature, and abbrevi- 
ated as GmPHTl; 1 through IS in the following content. 

These 15 PHT1 genes are dispersed across eight chro- 
mosomes and form 7 paralog gene pairs and one 
singleton (Figure 1A). These paralog gene pairs shared 
93.5 - 97.1% in sequence (Additional file 3 A) and similar 
gene structures (Additional file 3B). For example, both 
GmPHTl;8 and GmPHTl;9 have 3 exons, both GmPHTl;3 
and GmPHTl;14 are composed of 2 exons, while the others 
only contain one. However, an extra intron in 5'-UTRs of 
GmPHTl;l, GmPHTl;4 and GmPHTl;S was identified 
(Additional file 3B). 

To identify the phylogenetic relationship of the soy- 
bean PHT1 genes with full length sequence, a neighbor- 
joining tree was reconstructed based on the multiple 
sequence alignment (Additional file 4). The PHT1 family 



was monophyletic [29] and can be grouped into four 
subfamilies in the angiosperm: the subfamily I is com- 
posed of PHT1 genes induced by arbuscular mycorrhizal 
fungi (AMF), subfamily II genes are from both mono- 
and dicotyledonous species, subfamily III are exclusively 
from dicotyledonous species, and subfamily IV are exclu- 
sively from monocotyledonous species [26,30]. Further- 
more, the subfamily II was closely related to the homologs 
from the fungi by high bootstrap value (Additional file 4), 
indicating it was present before the occurrence of ter- 
restrial plant and an older evolutionary lineage. In the 
soybean, fourteen PHT1 genes were clustered into three 
subfamilies: subfamily I, subfamily II and subfamily III 
(Additional file 4) [21,26]. 

The paralog gene pairs promoter region sequence simi- 
larities are lower than their CDS sequences (Additional file 
3A). Many similar ds-acting regulatory DNA elements, 
which are relative to the nodulin, root, flower, leaf, seed, 
abiotic or biotic stress, sugar and hormone (Additional 
file 5A), can be found in promoter regions of 14 PHT1 
genes according to the PLACE results [31]. For example, 
the ds-elements relative to the root-specific (ROOTMO 
TIFTAPOX1) and nodule-specific (NODCON1GM and 
NODCON2GM) are present in 14 soybean PHT1 pro- 
moters (Additional file 5A). Except GmPHTl;9 promoter, 
other soybean PHT1 promoters contain phosphate starva- 
tion responsive ds-element (PIBS). And GmPHTl;9 pro- 
moter did not embody PIBS motif but a variant PIBS 
motif (76% similarity to PIBS) (Additional file 5A). The dif- 
ferences in common ds-elements across these promoter 
regions include both their number and their distance from 
the starting code (Additional file 5A). That indicated the 
number of ds-elements and their distance from the tran- 
scription start sites affected response abilities of PHT1 to 
the environment. 

Soybean PHT1 genes originated from four ancestors prior 
to the Gamma WGT event 

The average synonymous substitution rate (Ks) of hom- 
ologous blocks is a function of genomic evolutionary 
events that occurred since two homologous blocks di- 
verged from a common ancestor [25,32]. The modern 
soybean genome has undergone Gamma WGT (Ks> 
1.5), Legume WGD ( Ks - 0.3 to 1.5) and Glycine WGD 
(Ks - 0 to 0.3) [24,25]. To analyze the soybean PHT gene 
duplication relationship, Medicago truncatula, which ex- 
perienced the Gamma WGT event and Legume WGD 
event [33], was selected as the reference gene order and 
9 Medicago PHTls were identified (Additional file 5B). 

Based on the analysis of homologous genomic regions, 
seven paralog GmPHTl gene pairs fell into 14 syntenic 
blocks and diverged after the Glycine WGD event. This 
was based on each pair of blocks having collinearity, 
with the average Ks ranging from 0.19 to 0.25 (Figure 1A, 
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Gamma WGT Legume- WGD Glycine- WGD Gamma WGT Legume WGO Glycine WGD 

Figure 1 The evolution of the GmPHTI gene family. A, Syntenic relationships among homologous blocks carrying the 15 GmPHTl sequences 
after the Glycine WGD event. Similar colored blocks imply homology, the short red lines within these blocks show the location of GmPHTls, 
and the black oval is the centromere. GmPHTl;! 5 is the pseudogene and the paralog gene of GmPHTl;! 1 was lost. B, The evolutionary model for 
GmP/-/7~7-containing genomic blocks in the process of the soybean genome evolution, indicating GmPHTls originated from four independent 
ancestors. Different backgrounds depict different whole genome duplication events. The colored blocks imply homology based on the average 
Ks values. The paralog gene of GmPHTl;! 1 was lost in the dotted block dotted, although other genes are collinear with the block containing 
GmPHTl;!!. The detailed collinearity relationships are shown in Additional file 5C. 



Additional file 5C). Although the paralog gene of 
GmPHTl; 11 was lost in the syntenic block of chromo- 
some 2, the block containing GmPHTl; 11 underwent 
Glycine WGD events (Figure 1A and Additional file 5C). 
Because the homologous block containing MtPHT4 on 
Chromosome 5 in M. truncatula had a collinearity with 
the block harboring GmPHTl;ll (Additional file 5B). With 
the exception of the WGD duplication, the ancestor of 
two paralog gene pairs, GmPHTl;l/S and GmPHTl;6l 10, 
experienced tandem duplication before the Legume WGD 
event. Because the blocks embodying GmPHTl;l/6 and 
GmPHTl;5l 10 had a collinearity with the block containing 



two tandem genes, MtPHTS and 7, on Chromosome 1 of 
M. truncatula (Additional file 5B). 

Based on the average Ks values for the homologous 
blocks (Additional file 5C), the evolution history of all 
15 PHT1 genes was predicted. These PHT1 genes were 
categorized into four subgroups, GmPHTl A through D 
(Figure IB). For example, GmPHTIA included eight syn- 
tenic blocks, which represented five paralog gene pairs of 
PHT1 genes, GmPHTl;2/7, GmPHTl;4ll5, GmPHTl;l/S, 
GmPHTl;6llO and GmPHTl;3ll4. Both GmPHTIB and 
GmPHTl C contained two syntenic blocks and each con- 
tained one paralog gene pair of PHT1 genes, GmPHTi;8/9 
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Figure 2 The spatio-temporal transcription of GmPHTI. R, H, C, E, U, S, T1, T2, T3, T4, F: the root, hypocotyl, cotyledon, epicotyls, unifoliolate 
leaf, the stem, the first trifoliolate leaf, the second trifoliolate leaf, the third trifoliolate leaf, the fourth trifoliolate leaf and the flower, respectively. 
And PI, P2, and P3: seven, fourteen and twenty one days after the onset of flowering, respectively. The geometric means of GmSKIP16 and 
Gm UNKI transcripts were used as reference transcripts. The values are means of three replicates and each replicate represented a pool from at 
least five plants. Error bars represent SD. 
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and GrnPHTl;12/13, respectively. GmPHTID group was 
made up of two syntenic blocks and contained the Gm- 
PHT1;11 gene. These results indicated the soybean PHT1 
gene family originates from four distinct ancestors, at least 
prior to the Gamma WGT event. 

Divergence of GmPHTI expressions in different tissues 
and in developmental stages 

Duplicated genes typically exhibit increased expression 
divergence, thus gene expression changes shape evolu- 
tionary rates of proteins and re-establish the gene bal- 
ance after duplication [34]. Based on the spatio-temporal 
expression of GmPHTI genes through real time quanti- 
tative RT-PCR (qRT-PCR), most paralog gene pairs co- 
expressed in 16 tissues (Figure 2, Additional file 5D). 
However, two paralog gene pairs demonstrated obvious 
expression pattern differences. GrnPHTl;14, was only 
expressed roots, but the expression of its paralog gene, 
GmPHTl;3, was detected at low level in all samples. And 
the GmPHTl;2 paralog gene, GmPHTI; 7, did not express 
in hypocotyls and epicotyls at the seedling stage. 

The expression levels among paralog gene pairs dem- 
onstrated clear divergence (Figure 3). Among the six 
paralog gene pairs, 80 pairs of co-expression values were 
obtained from 16 tissues. The overall expression levels 
difference were as follows, -24% had between a 1 to 2 
fold change, ~ 54% between a 2 -to 10 fold change, and 
~ 22% had more than a 10-fold change. Moreover, gene- 
biased expression levels between a paralog gene pair were 
observed. For example, one paralog gene pair, GmPHTl;2 
and GmPHTI; 7 displayed expression biased to GmPHTI ;2 
in 15 tissues, only different in the pods after 21 FAD. 



In addition, expressions were biased to GmPHTl;12 be- 
tween the paralog gene pair, GmPHTl;12 and GmPHTl;13 
(Figure 3). 

GmPHTI genes differential response to the Pi stress 

Plant root performance dependents directly on Pi avail- 
ability in soil [35]. Pi stress induces most of the known 
PHTls genes (Review in [14,17,18]). To investigate Gm- 
PHTI responses under low Pi (Pi = 1 uM) stress condi- 
tions, soybean PHT1 gene expressions were evaluated in 
the root, stem and leaf, at the vegetative stage (Figure 4). 
Compared with expressions under high Pi (Pi = 500 uM) 
conditions, all soybean PHT1 genes were up-regulated in 
roots under the low Pi condition. Divergent expressions 
of some paralogs were observed in stem or leaf tissues 
(Figure 4). Except GmPHTl;4, 7 and 72, transcriptions 
of other soyben PHT1 genes were down-regulated in 
stems under low Pi conditions. Additionally, expressions 
of GmPHTl;3 and 8 were induced by the high Pi condi- 
tion in leaves. 

In addition to under the low Pi condition, the re- 
sponses of GmPHTI genes to a series of Pi concentrations 
in the roots were employed to investigate the expres- 
sion divergence of the paralog gene pairs (Figure 5). Com- 
pared with those under the low Pi (Pi = 1 uM) condition, 
expressions of all 13 GmPHTI genes were significantly 
supressed in the roots under the Pi = 10 uM condition 
except GmPHTl;6. And when the external Pi concentra- 
tion was more than 10 uM, the range of expressions were 
very narrow except GmPHTl;2, GmPHTl;3, GmPHTl;12 
and GmPHTI; 13. That indicated most paralog gene pairs 
showed the similar responses to the external Pi in the 
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Figure 3 The transcript divergence of six paralogous pairs of genes within a given tissue in different developmental stages. The upper 
triangles showed the expression of the lift genes of the paralog gene pairs, and the lower triangle the expression of the right genes of the 
paralog gene pairs. The raw average relative expressions were in the Additional file 5D. R, H, C, E, U, S, T1, T2, T3, T4, F: the root, hypocotyl, 
cotyledon, epicotyls, unifoliolate leaf, the stem, the first trifoliolate leaf, the second trifoliolate leaf, the third trifoliolate leaf, the fourth trifoliolate 
leaf and the flower, respectively. And PI, P2, and P3: seven, fourteen and twenty one days after the onset of flowering respectively. The 
geometric means of GmSKIP16 and GmUNKI transcripts were used as the reference transcript. 
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Figure 4 GmPHTI transcription in the response to the low Pi (1 pM) stress. At least five individual plants per treatment were harvested when 
the unifoliolate leaves fully expanded. The transcript abundance of GmPHTI genes in the roots (R), stems (S) and unifoliolate leaves (U) were shown 
using the expression of the 500 pM Pi treatment group as a control. The geometric means o1GmSKIP16 and GmUNKI transcripts were used as the 
reference transcript. The values are means of three replicates and each replicate represented a pool from at least five plants. Error bars represent SD. 
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root except GmPHTl;2/7 and GmPHTl;3l 14. And under 
the high- Pi condition, the transcriptions of GmPHTI ;2 
and GmPHTl;3 were induced, but the strength were lower 
than under the low Pi condition (Figure 5). That suggested 
that GmPHTl;2 and GmPHTl;3 may contributed to Pi 
tolerance in soybean. 

Divergence in the Pi transport activities of GmPHTI in yeast 

To investigate the divergence of 14 soybean PHTls in the 
Pi transport ability, the abilities of heterologous comple- 
mentation of yeast double mutant (PAM2, Apho84 Apho89) 
[36] were tested on the nutrition defect media (Additional 
file 6). The resulting sequences were confirmed by se- 
quencing and cloned into pYES-DEST52 drive by GAL1 
promoter. 

As Figure 6 shown, only one paralog gene pairs, 
GrnPHTl;6/10, showed difference of complementation 
ability. And PAM2 cells carrying GmPHTl;l } GrnPHTl;2, 
GrnPHTl;5, GmPHTl;7 } and GmPHTl;10 grew well on 
the induced modified SD media (the carbon source is 
galactose) under the low Pi condition, whereas PAM2 
cells harboring other 9 GmPHTI and the empty vector 
did not grow normally under the same conditions. This 



data suggested that GmPHTl;l } GmPHTl;2, GmPHTl;5 } 
GmPHTI; 7, and GmPHTI; 10 may be high- affinity phos- 
phate transporters and others were lower-affinity ones. 

Kinetic parameters (Km) can display the affinity ability 
of the PHT1 proteins for transporting Pi. Subsequent 
32 Pi uptake assays were employed to further confirm the 
different affinity of GmPHTI and to analyze K m values 
of Pi uptake of 4 paralogous pair transporters (Figure 7). 
The paralogous gene pairs also displayed divergence on 
the affinity for Pi. For example, GmPHTI; 1 had a K m of 
68.9 uM, while its papralogous transporter GmPHTI ;5 
had a K m of 243.9 uM; the K m of GmPHTl;12 was 
505.1 uM, whereas the K m of GmPHTl;13 was 363.6 uM, 
both of them were low affinity transporters. 

Divergence on the subcellular localization of 
GmPHTI proteins 

The PHT1 protein in plants primarily localizes to the 
plasma membrane [14,20,37]. Under Pi stress they are 
targeted to endocytic compartments [38]. To analyze sub- 
cellular localization of GmPHTI proteins, we tagged the 
GmPHTI proteins with yellow fluorescence protein (YFP) 
at their C-terminal. With the exception of GmPHTl;8 and 
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Figure 5 GmPHTI transcription in the root is affected by external Pi concentration. Roots of at least five individual plants per treatment 
were harvested when the unifoliolate leaves fully expanded. Different paralogous pair genes were plotted in individual figures. The geometric 
means of GmSKIPW and GmUNKI transcripts were used as the reference transcript. The values are means of three replicates and each replicate 
represented a pool from at least five plants. Error bars represent SD. 



GmPHTI; 10, twelve GmPHTI proteins co-localized to 
the plasma membrane (Figure 8). 

One exceptional case was the localization pattern of 
GmPHTl;8. Though no fluorescent signal of GmPHTI; 
8-YFP was detected in the plasma membrane (Figure 9A), 
a strong signals in endoplasmic reticulum (ER) was 
detected. GmPHTl;8 co-localized to the ER in Arabi- 
dopsis mesophyll protoplasts along with ER-marker [39] 
(Figure 9B). 



Another exception was GmPHTI; 10 localization. Here, 
GmPHTl;10 fluorescent signals were detected both at 
the plasma membrane as well as outside of the plasma 
membrane (Figure 9C). These localization patterns were 
recapitulated in plasmolytic onion epidermal cells. Strong 
fluorescence was detected in cell walls, plasma membranes 
and Hechtian strands (Figure 9D). Moreover, paralog gene, 
GmPHTl;6, and another gene pair, GmPHTl;l and 5, pre- 
sented with strong signals only in the plasma membrane 
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Figure 6 Complementation analysis of GmPHTI genes in the yeast double mutant PAM2. Wild type (WT) and mutant (PAM2) yeast strains 
containing either an empty pYES-DEST52 vector (Empty vector) or pYES-DEST52 (bearing one of the GmPHTI sequences). Serial dilutions were 
spotted onto a selective medium supplemented with low concentrations (10 uM) of Pi, with either glucose (Glu, non-induced medium) or 
galactose (Gal, induced medium) as the carbon source. Each spot represented 5 ul yeast culture, diluted from a master culture, as indicated. 
Yellow color indicates the cell in the spots grew well. Images were captured after three days. 



(Additional file 7). This pattern was a copy of that found in 
Medicago MtPT3 [20] and Arabidopsis Phtl;l [37]. 

Discussion 

Different lineages of PHT1 genes undergoing different 
nature selections during the plant evolutionary process 

Land plants only gain phosphorous from soil solutions 
through the root, and most PHT1 genes express and are 
induced by the Pi starvation or by AMF in the root 
[16,40,41], indicating Pi uptake is heavily dependent on 
the phosphate transporter 1 family in the plant. In order 
to adapt the low- Pi environment and improve Pi uptake, 
two Pi uptake pathways have evolved under the nature 
selection. First, the direct Pi acquisition pathway acts 
through modifications of root architecture, root length 
and lateral root numbers [42-44] . The second pathway is 
symbiotic Pi uptake, which acts through plant/fungi in- 
teractions [29,45]. 

Before the occurrence of the first terrestrial plant, 
which were present about 475 million years ago [46], the 
PHT1 subfamily II has been divergent based on the phylo- 
genic tree (Additional file 4), indicating the direct Pi acqui- 
sition pathway was the main Pi acquisition pathway of 
ancient plants. About 460 million years ago, AMF occurred 
and may have played a crucial role in facilitating the 
colonization of land by plants most likely only consisted of 
plants on the bryophytic level [47]. And then the PHT1 
subfamily I was diverged, suggesting the subfamily I is 
another older evolutionary lineage. Therefore, AMF have 
been symbionts of land plants for at least 450 million years 



old, and the symbiotic Pi uptake is an evolutionarily an- 
cient Pi acquisition strategy for plant life on land [29,48]. 

In the evolutionary process of plants, genome duplica- 
tions were ancient and recurrent [49]. They provide the 
important raw genetic material to adapt to challenging 
environment and increase the diversity of plants. New 
genome sequences and improved analytical approaches 
are clarifying angiosperm evolution and revealing pat- 
terns of differential gene loss after genome duplication 
and differential gene retention associated with evolution 
of some morphological complexity [50]. According to 
the evolution of the PHT1 family, the members of PHT1 
subfamily I and II, which are diverged eailier, were not 
expanded as expected in angiosperms compared with 
the subfamily III and IV. For example, In Arabidopsis, 
which is not host plant of AMF and experienced at least 
three polyploidy events [51], nine members of the PHT1 
family were found, but no members of PHT1 subfamily I 
[52,53]. In Populus trichocarpa, experiencing at least 
two polyploidy events [54], PtPTIO and PtPT8 belong to 
the PHT1 subfamily I, and PtPT8 is a pseudogene [30]. 
In the rice, experiencing one polyploidy event [55], contain 
two nonredundant members of the PHT1 subfamily I, 
OsPTll and OsPT13 [56,57]. And OsPT13 is conserved 
and special across monocotyledons [56]. In the soybean, 
undergoing three WGD events [23-25], three members of 
the PHT1 subfamily I, GrnPHTl;ll, 12 and 13, were found. 
GmPHTl;12 and GmPHTl;13, were a paralog gene pair 
arisen after the Glycine WGD event. And the GmPHTI; 11 
paralog gene was lost after the Legume WGD event and 
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then GmPHTl; 11 became a singleton (Figure 1). Con- 
versely, the GmPHTl- A experienced three rounds of WGD 
events and is composed of 9 members with whole coding 
sequences, plus one pseudogene (GrnPHTl;15). For the 
PHT1 subfamily II, about two members can be found in 
each plants, while members of subfamily III or IV were 
expanded after gene duplication (Additional file 4). Taken 
together, different selection pressures retained different sub- 
groups during the plant evolution. 



Multiple divergences resulted in retention of paralogs 
from one gene family 

Polyploidy is widespread and is a process that recur- 
rently shaped eukaryotic genomes in plant. After gen- 
ome duplication once fixed within species, the three 
possible fates of duplicated genes: neofunctionalization, 
subfunctionalization or nonfunctionalization [6-8,58]. If 
the Ka/Ks value is more than 1.0, gene copies would 
undergo positive selection and have new functions. On 



Fan et al. BMC Plant Biology 201 3, 13:48 
http://www.biomedcentral.com/1471-2229/13/48 



Page 10 of 16 



YFP 


FM4-64 Merged DIC 


GmPHWs+<__ 




0 












/ 

GmPHT1:2 _ 




( ) 












GrnPHT1;3 


















/ 

GmPHT1:12 




J 


(' ' # - >i 










GmPHTt^— 






o 












GmPHT1;9 - 









YFP FM4-64 Merged DIC 


GmPHT1;5 - 










( 




Q 


r ■ 




GmPHT1;U — 




,) 




p 


GmPHT1;13 — 






O 


) 

GmPHTl;6 — 


\ 


) 




GmPHT1;11 — 




■ 







Figure 8 Transient transcription in A. thaliana mesophyll protoplasts of 12 GmPHTI-YFP fusions. The FM4-64 signal is diagnostic for the 
plasma membrane. DIC: differential interference contrast. Scale bar: 10 pm. 



the contrary, subfunctionalization would be expected to 
undergo purifying selection. The Ka/Ks values for all soy- 
bean PHT1 genes were less than 1.0 (Additional file 5F), 
thus, the paralog PHT1 gene pairs were undergoing purify- 
ing selections in the soybean evolution and subfunc- 
tionalized. In the soybean genome, about 75% genes are 
present in multiple copies, and approximately 50% of 
paralogs are differentially expressed and have undergone 
expression subfunctionalization, and only a small propor- 
tion of the duplicated genes have been neofunctionalized 
or non-functionalized [12], suggesting that the main fate of 
duplicated genes were subfunctionalization. 

Although the different functions of Pi transport were 
that they have different affinities, all the published PHT1 
genes display conserved functions of Pi transport. In one 
species, the greatest difference is the divergence of their 
expression profile, which results in their differently func- 
tional sites in the plant. In Arabidospsis, AtPHTl;6 ex- 
presses only in flowers, and both AtPHTl;8 and 9 express 
only in the roots [52], and transcriptions of AtPHTl;5 



are detected in the old tissues and induced by ethylene 
[59]. Additionally, most AtPHTl genes exhibit strong 
expression in several tissues although their expressions 
overlapped to some extent [16]. In angiosperms, some 
members of the PHT1 family, such as MtPT4, OsPTll, 
OsPT13, TaPHTh HvPT8, StPT4, LePT4 and PtPTIO, are 
induced only by AMF, while transcriptions of other mem- 
bers, such as StPT3, OsPTl, OsPT2, OsPT3, OsPT6, 
OsPT9, and OsPTIO are not special to AMF [30,57,60-64]. 
The response of most PHT1 genes to the low Pi are simi- 
lar, but to deficiencies of other nutrient elements, such as 
nitrogen, potassium, iron, and zinc, are different [27,65]. 
Three different paralog gene pairs, show different expres- 
sion patterns or levels under deficient N, K, or Fe condi- 
tions [27], suggesting another subfunctionalization event 
among these paralogs. 

Subfunctionalization can be taken as genetic redundancy 
[66,67]. For example, in Arabidopsis, the Pi uptake rates of 
single mutants, phtl;l and phtl;4, were reduced about 20%, 
but the rate of double mutant was reduced approximately 
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57%, under the low-Pi condition [37]. Mutations in tomato 
(Solanum lycopersicum) StPT4 affected neither mycorrhizal 
Pi uptake nor establishment of the symbiotic pathway, this 
was due to arbuscular-mycorrhizal symbiosis triggering 
expression of other three PHT1 genes [64]. These results 
indicate Pi uptake functionally redundancy. The expres- 
sion profiles of soybean PHT1 (Figure 2), indicates exten- 
sively overlaid, these results suggest redundancy in these 
paralogs as well. 

PHT1, as a plasma membrane protein, consists of two 
regions of six transmembrane domains separated by a 
hydrophilic loop [68,69]. All the 14 soybean PHT1 proteins, 



like other PHTls, contain 12 transmembrane domains 
(Additional file 8). Up to now, the PHT1 family members 
were believed to localize finally to the plasma membrane 
(the function site) after post-translational modifications 
[38,70]. According to our results (Figure 8), the fluores- 
cence signals of most soybean PHTls were detected in the 
plasma membrane. Except the plasma membrane, the 
fluorescence signals of MtPT3 [20] and AtPHTljl [37] 
can be detected in hechtian strands. The subcellular locali- 
zation of GmPHTl;10 in onion epidermal cells was simi- 
lar to that of MtPT3 and AtPHTljl (Figure 9D), but the 
same localization pattern was not observed in its paralog 
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(Additional file 9). Interestingly, the localization of Gm 
PHT1;8 was not at the cytoplasmic membrane but at the 
endoplasmic reticulum (Figure 9B), however its paralog, 
GmPHTl;9, localized to the plasma membrane. 

Subfunctionalization of paralogs beneficial to soybean Pi 
uptake, translocation and remobilization 

To enhance acquisition of external Pi, plants can mor- 
phologically regulate root architecture to enhance the 
root surface/ soil volume ratio, also root architecture is 
closely related to P efficiency [40,71-74]. The number 
and length of lateral roots and the length of primary 
roots increased under the low-Pi condition (Additional 
file 9). At the molecular level, the members of PHT1 
family play important roles in Pi uptake from soil solu- 
tions thus exhibiting robust expression in roots [16,40]. 
According to our results (Figure 4) and others' [27], 
fourteen soybean PHT1 genes expressed in the root were 
up-regulated by low-Pi stress. Moreover, GmPHTl;l, 2, 
5, 7, and 10 had high affinity to Pi. Thus, GmPHTl;l, 2, 
5, 7, and 10 may play important roles in the direct Pi up- 
take from low- Pi soil solutions. 

In addition to direct Pi uptake, symbiotic phosphate 
uptake is an ancestral Pi acquisition strategy for plants, 
meaning that some PHT1 genes are induced by arbuscular 
mycorrhizas [16,29,40,63,75]. In the soybean, the tran- 
scription of three soybean PHT1 genes, GmPT7, 10 and 
11 {GmPHTl; 11, 13, and 12, respectively in this study), 
was induced by arbuscular mycorrhizal fungi [26]. These 
results indicate these three genes have important roles in 
soybean symbiotic Pi uptake. 

After Pi uptake, distribution and remobilization of Pi, 
within the plant, is accomplished through the membrane 
transport systems of the shoots to the sink tissues wher- 
ever symplastic connections are lost [15,40]. Although 
other PHT gene families, such as the PHT2 family [76], 
are involved the process, the PHT1 gene family local- 
ization in the plasma membrane is the most important 
[19,52,77,78]. Based on our results, GrnPHTl;7, 8, 10 
and 12 exhibited stronger expression than their corre- 
sponding paralog genes in the stems, at the seedling and 
flowering stage (Figure 2), suggesting involvement in in- 
ternal Pi transport in the shoots. 

Another developmental signal, senescence, has been 
reported to strongly induced expression of some PHT1 
genes [30,59,79], and most leaf phosphorus is remobilized 
to the seed during reproductive development in soybean 
[80]. For instance, in other plants, the expression of 
PhPTl is up regulated during petunia petal senescence 
[79] and transcript level of Phtl;5 is elevated in the old 
leaves in Arabidopsis [59]. In the soybean, GmPHTl;l ex- 
pression increased in unifoliolates along with developmen- 
tal process and reached the peak during flowering time 
(Figure 2). Moreover, again during flowering time, the 



transcription level of GmPHTl;l was relative to leaf 
ages (Figure 2).This suggests that GmPHTl;l is related to 
the re-utilization of Pi from older leaves. Cotyledons are 
the main phosphorus store tissue and the phosphorus re- 
source tissue at the seedling stage. High expression of 
GmPHTl;8 was detected in the cotyledons at the seedling 
stage (Figure 2), suggesting GmPHTl;8 played an impor- 
tant roles in recycling Pi from cotyledons. 

Different subcellular localizations of GmPHTl pro- 
teins correlated to cellular homeostasis according to our 
findings. Although the majority of GmPHTl proteins lo- 
calized on cytoplasma membrane (Figure 8), similarly to 
PHTls in other plants [81], GmPHTl;8 and GmPHTl;10 
have unique localization patterns. GmPHTl; 10 localized 
to Hechtian strands in addition to the cytoplasma mem- 
brane and cell walls. These Hechtian transporters may 
play critical roles in Pi transport between the cytoplasmic 
membrane and cell wall or between cells [82]. Perhaps, 
GmPHTl;10 had important roles in the cross talk of Pi 
flux or signals amongst the cells. In addition, GmPHTl;8 
localized, exclusively, to the endomembrane system in- 
stead of cytoplasma membrane (Figure 9B). A functional 
auxin transporter, AtPIN5, does not have a direct role in 
cell-to-cell transport but regulates intracellular auxin 
homeostasis and localizes to endoplasmic reticulum (ER), 
unlike other characterized plasma membrane PIN pro- 
teins [83]. Given the function of the AtPIN5, this result 
may indicate GmPHTl;8 has a role in regulating intracel- 
lular Pi homeostasis and metabolism. 

Conclusion 

In the soybean, there were 14 PHT1 genes with full 
whole CDS plus one pseudogene, and they originated 
from four different different ancestors, GmPHTl A, B, C 
and D, before the Gamma WGT events in the soybean 
evolution history. Three polyploidy events expanded the 
members of GmPHTl A. In addition, one tandem dupli- 
cation also increased the members of GmPHTl A after 
the Legume WGD and before the Glycine WGD. The 
retentions of paralog genes of GmPHTIB and C were 
only after the Glycine WGD. GmPHTID contained one 
member, of which paralog gene was lost. Fourteen soy- 
bean PHTls underwent the purifying selection and had 
the conserved function in Pi uptake although they had 
different affinities for Pi, and GmPHTl; IS experienced 
pseudogenization. Expression divergence levels were the 
main style of subfunctionalization of the paralog gene 
pairs. The expression ratios were more than two amongst 
paralog gene pairs in about 76% co-expression tissues. Al- 
though 14 soybean PHT1 genes more strongly expressed 
in the roots under the low Pi condition, the response ex- 
tent were different. Similar subcelluar localizations to the 
plasma membrane were found amongst most soybean 
PHT1 proteins. But GmPHTl;8 was not localized to the 
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plasma membrane but to the endoplasmic reticulum, 
while GmPHTl;10 was localized to Hechtian strands in 
addition to plasma membranes. 

Methods 

Plant materials 

We employed the soybean cultivar (KN18) in all experi- 
ments. Plants were grown in a growth chamber under 
short day conditions (8 hr light/ 16 hr dark) at a tem- 
perature 25°C ~ 28°C. Tissues harvested at two different 
developmental stages, fully expanded unifoliolate leaf 
and flowering onset, were evaluated for GmPHTl ex- 
pression patterns and levels in different tissues. We col- 
lected pods samples 7, 14 and 21 days after flowering. To 
investigate the effect of external Pi concentrations of 
GmPHTl transcription, plants raised in a hydroponic 
culture with an initial Pi concentration of 500 |iM. Once 
unifoliolate leaves were fully expanded, the culture solu- 
tion Pi concentration was changed to 1 |iM, 10 |iM, 
100 (iM, 500 [iM, 1 mM, 2 mM or 5 mM. Solutions were 
refreshed once every two days over the course of one 
week. Each experimental group, containing a minimum 
of five individual plants per group, was harvested, frozen 
in liquid nitrogen and stored at -80°C until required. All 
experiments were repeated three times under the con- 
sistent conditions. 

Identification, cloning and expression vector construction 
of soybean PHT1 genes 

Soybean genome sequences (version 1.09), downloaded 
from Phytozome V 8.0, and used to obtain a set of known 
PHT1 sequences (Additional file 1). Members of the 
GmPHTl family (Additional file 5F) were identified using 
profile hidden Markov models built by HMMER v3.0 [84], 
following the HMMER user guide. The GmPHTl gene 
nomenclature is presented in Additional file 5F. A pseudo- 
gene (GmPHTl; IS) was predicted from syntenic analysis 
using the PGDD or CoGe database. 

Given the sequence similarity between the 14 PHT1 
genes in this study, we designed primers specific to the 5' 
or 3' UTR of each gene (Additional file 5F) for RT-PCR. 
Subsequently, these sequences were used as templates to 
clone the CDSs with corresponding primers (Additional 
file 5E) into an entry vector pGWC [85], next genes were 
recombined into an appropriate yeast expression vector, 
pYES-DEST52 (Invitrogen), and plant expression vectors, 
pEXSG-YFP-GW, with Gateway technology. 

Bioinformatic analysis 

To identify the intra-genome (G. max) or cross-genome 
(G. max and M. truncatula) syntenic relationships we 
employed, SynMap (http://genomevolution.org/CoGe/Syn 
Map.pl). To investigate the synteny of blocks containing 
PHT1 genes, homology data derived from CDS-CDS 



comparisons made using Blastz, with an E-value cutoff of 
le-5, other parameters were default or recommended. 
The nine members of the PHT1 gene family was identified 
(Additional file 5B) in M. truncatula as above. 

Transmembrane (TM) domains of PHT genes were 
predicted by TopPred 2 (http://bioweb.pasteur.fr/seqanal/ 
interfaces/toppred.html) [86]. PLACE [31] (http://www. 
dna.affrc.go.jp/PLACE/signalscan.html) was employed to 
scan the ds-acting elements of the predicted promoter se- 
quences for every GmPHTl genes. After alignment of the 
full GmPHTl genes' CDS by ClustalW (http://www.ebi.ac. 
uk/Tools/msa/clustalw2/), Ka (non-synonymous substitu- 
tions per non-synonymous site) and Ks (synonymous sub- 
stitutions per synonymous site) of the paralog genes was 
computed by DnaSP v5 (Additional file 5F) [87]. 

Yeast manipulations 

The yeast Pi uptake-defective mutant PAM2 (zlpho84 
zlpho89) [36] was employed to identify the Pi transport 
activities of the 14 GmPHTl genes. All GmPHTl yeast 
recombinant expression vectors carrying PHT1 CDS were 
transformed into PAM2. Transformed cells grew to loga- 
rithmic phase in a synthetic liquid medium (SM, 1 liter: 
5.9 g YNB (CYN0804, For Medium), 0.77 g mixture of 
amino acid without Ura (Clontech), 2% raffinose, 6 mM Pi, 
pH5.8). The cells were harvested when enter the log phase, 
washed with Pi-free medium, and re-suspended in the same 
medium to different concentration. For yeast mutant com- 
plementation experiments, the yeast cells dilution were 
plated onto solid, induced or non-induced, medium (1 liter: 
5.9 g YNB, 0.77 g mixture of amino acid without Ura, 2% 
galactose or glucose, 2% agar (#05038, sigma), 10 (iM Pi, 
pH 6.5). Potassium was supplemented with equivalent 
KC1, and 0.04% bromocresol purple was used as a pH indi- 
cator [19], plates were incubated at 30°C, for 3 days. We 
performed Pi uptake experiments using 32 Pi as previously 
reported [20]. Yeast cells grew in liquid, non-induced 
medium (1 mM Pi) for 6 hr, cells were harvested, and 
then washed with the Pi-free medium 3 times. Next, yeast 
cells were grown in liquid, induced medium, without Pi, for 
4 hours, harvested and washed with water 3 times. After 
the final wash, cells were resuspended at 200 mg cells/ml -1 . 
Cell suspension 30 |iL was added to YNB medium (570 |il), 
containing 25 mM sodium citrate (pH4.5), 2% glucose and 
appropriate concentration gradient of Pi (10 |iM, 50 |iM, 
100 (iM, 300 (iM, 500 (iM or 1000 (iM). Radioactive 32 Pi 
was added to the yeast solution at a final concentration of 
0.125 (iCi, and cells were incubated at 30°C with gentle 
agitation for 3 min. Immediately, we added 4 ml of ice-cold 
stop solution (25 mM sodium citrate buffer, pH 4.5) trans- 
ferred onto glass fiber filters and washed with an additional 
4 ml of stop solution. A scintillation spectroscopy measured 
the samples radioactivity. All experiments were repeated 
three times with similar results. The kinetic data was 
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analyzed by nonlinear regression with GraphPad Prism 5 
software. 



Additional file 9: The appearance of soybean roots in plants 
subjected to different Pi concentration conditions. 



Total RNA isolation and quantitative reverse 
transcription-PCR (RT-qPCR) 

The procedures used for RNA extraction and cDNA syn- 
thesis are as described by Hu, et al [88]. All expression ex- 
periments were repeated a minimum of three times. The 
primers for the 15 PHT1 genes examined by qRT-PCR are 
listed in Additional file 5F and 14 of the primer pairs had 
an efficiency greater than 90% as determined by LinReg 
PCR (http://LinRegPCR.HFRC.nl) [89]. GmSKIP16, and 
GmUNKl were used as reference genes for all qRT-PCRs 
[88]. The relative expression was computed following the 
formula 2 (Cta Ctb) , where C t a and Qb are the average Q 
values of the reference and target genes, respectively. 

Subcellular localization analysis 

Transient expression of YFP tagged GmPHTl were 
performed in Arabidopsis mesophyll protoplasts through 
PEG-calcium transfection [90] and in onion epidermal 
cells by bombardment [20] . Experiments were carried out 
to analyze subcellular localizations of 14 GmPHT proteins 
and to investigate the cellular localization of GmPHTl 
proteins in vivo. Specific subcellular organelles markers, 
plasma membrane (FM4-64) [91] and endoplasmic retic- 
ulum (ER-mRFP) [39], were selected. Section Z-series im- 
ages were collected at different intervals throughout the 
specimens to facilitate observation. Twenty to thirty cells 
were imaged for each experiment. Post-acquisition image 
analysis and processing was performed using MBF ImageJ, 
version 1.46. 
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